Speculative Parallelization in Decoupled Look-ahead Architectures
Authors
Abstract
One well-known approach to mitigating the impact of branch mispredictions and cache misses is to enable deep look-ahead so as to overlap instruction and data supply with instruction processing. A continuous look-ahead process that uses a separate thread of control on another hardware context is one such approach, which we call decoupled look-ahead [1], [2]. However, in such look-ahead schemes, the look-ahead thread can often become the performance bottleneck. In this work, we explore speculative parallelization in a decoupled look-ahead agent. Intuitively, speculative parallelization is aptly suited to the task of speeding up the look-ahead agent for two reasons. First, the program slice for look-ahead does not contain all the data dependencies embedded in the original program, providing more opportunities for parallelization. Second, the execution of the slice is only for look-ahead purposes, and thus the environment is inherently more tolerant of dependence violations.
Similar resources
Accelerating Decoupled Look-ahead to Exploit Implicit Parallelism
Despite the proliferation of multi-core and multi-threaded architectures, exploiting implicit parallelism for a single semantic thread is still a crucial component in achieving high performance. While a canonical out-of-order engine can effectively uncover implicit parallelism in sequential programs, its effectiveness is often hindered by instruction and data supply imperfections (manifested as...
Accelerating Decoupled Look-ahead via Weak Dependence Removal: A Metaheuristic Approach – Technical Report
Despite the proliferation of multi-core and multi-threaded architectures, exploiting implicit parallelism for a single semantic thread is still a crucial component in achieving high performance. Look-ahead is a tried-and-true strategy in uncovering implicit parallelism, but a conventional, monolithic out-of-order core quickly becomes resource-inefficient when looking beyond a small distance. A ...
Synchronization on Speculative Parallelization of Many-Particle Collision Simulation
High-performance particle simulations are in demand. As more and more computers switch from the traditional uniprocessor architectures to multi-core and multiprocessor parallel architectures, many computer simulations will be migrated to parallel environments to shorten the execution time of simulation. A recently developed speculative parallelization for many-particle collision simulation show...
Intelligent Speculation for Pipelined Multithreading
In recent years, microprocessor manufacturers have shifted their focus from single-core to multi-core processors. Since many of today’s applications are single-threaded and since it is likely that many of tomorrow’s applications will have far fewer threads than there will be processor cores, automatic thread extraction is an essential tool for effectively leveraging today’s multi-core and tomor...
Asap: Automatic Speculative Acyclic Parallelization for Clusters
While clusters of commodity servers and switches are the most popular form of large-scale parallel computers, many programs are not easily parallelized for clusters due to high internode communication cost and lack of globally shared memory. Speculative Decoupled Software Pipelining (Spec-DSWP) is a promising automatic parallelization technique for clusters that speculatively partitions a loop ...